Haken proved that every resolution refutation of the pigeonhole formula has at least exponential size.
Groote and Zantema proved that a particular OBDD computation of the pigeonhole formula has
exponential size. Here we show that any OBDD refutation of the pigeonhole formula has
exponential size, too: we prove that the size of one of the intermediate OBDDs is $\Omega(1.025^n)$.

1 Introduction 

The pigeonhole principle, also known as Dirichlet's box principle, states that n holes can hold at most n
objects with one object to a hole. The propositional formulas describing this principle were introduced
by Cook and Reckhow in 1979 [5]. The formula is a CNF parameterized by n. It is unsatisfiable, but
after removing any single clause it becomes satisfiable; it is thus minimally unsatisfiable.

The formula has a very simple shape, and a meta-argument for its unsatisfiability is easily given, but
standard techniques for proving unsatisfiability automatically run out of time for quite small values of n.
Therefore, this formula is a good benchmark to test the efficiency of an approach for deciding
(un)satisfiability.

Also, on the theoretical side, it is the basis of many interesting results. A landmark result is that of
Haken [7], who proved that the length of any resolution refutation of the pigeonhole formula is at least
exponential in n. Surprisingly, Cook proved that it admits a polynomial-size refutation based on extended
resolution [4].

An Ordered Binary Decision Diagram (OBDD), also referred to as a reduced OBDD (ROBDD) or just
a BDD, is a data structure that is used to represent Boolean functions [2, 12].

OBDDs have some interesting properties: they provide compact and canonical representations of
Boolean functions, and there are efficient algorithms for performing logical operations on OBDDs. As a
result, OBDDs have been successfully applied to a wide variety of tasks, particularly in VLSI design and
CAD verification [9]. Less well-known applications include fault tree analysis, Bayesian
reasoning and product configuration.



E. Markakis, I. Mills (Eds.): 4th Athens Colloquium
on Algorithms and Complexity (ACAC 2009)
EPTCS 4, 2009, pp. 13-21, doi: 10.4204/EPTCS.4.2



© O. Tveretina & C. Sinz & H. Zantema 
This work is licensed under the 
Creative Commons Attribution License.






An Exponential Lower Bound on OBDD Refutations for Pigeonhole Formulas 



As a propositional proof system, OBDDs were studied, e.g., by Atserias et al. [1]. The authors
introduce a very general proof system based on constraint propagation. OBDDs are a special case of
this proof system. Their proof system has four rules: axiom, join, projection, and weakening. The first
two rules, axiom and join, correspond to an application of the OBDD apply operator. Projection and
weakening are introduced to reduce the size of intermediate OBDDs. It was shown that the OBDD proof
system containing all four rules is strictly stronger than resolution [1], but it still admits exponential
lower bounds [8].

In our paper, by the OBDD proof of a formula $\varphi$ we mean the computation of the corresponding
OBDD using the apply operation, i.e., in terms of the above proof system from [1], we allow only
two rules, namely axiom and join. If the formula contains n Boolean connectives, then the OBDD
construction requires exactly n calls of apply, and the exponential blow-up of the size of the proof is
caused by the growth of the size of the arguments.

In [6] it was proved that a particular OBDD computation of the pigeonhole formula has at least
exponential size. On the other hand, it was proved in [3] that the pigeonhole formula admits a
polynomial-size OBDD refutation in a setting including existential quantification (i.e. including the
projection rule).

In this paper we prove that, for the notion of OBDD refutation along the lines of [3], containing
the classical ingredients of OBDD computation but excluding existential quantification, there is an
exponential lower bound on the size of OBDD refutations of the pigeonhole formula. This is much
stronger than the result from [6]: there, the only computation considered first computes the conjunction
of all positive clauses, then the conjunction of all negative clauses, and finally the conjunction of these
two. In our setting, the clauses of the pigeonhole formula may be processed in arbitrary order. We
show that in any OBDD refutation at least one of the intermediate OBDDs has size exponential in
n. As a consequence, the gap between polynomial and exponential in the OBDD refutation
framework for the pigeonhole formula is caused by the rule for existential quantification.

We start with preliminaries in Section 2. In Section 3 we prove an exponential lower bound on
OBDD refutations for the pigeonhole formula. Finally, Section 4 contains conclusions.

2 Preliminaries 

We consider propositional formulas in Conjunctive Normal Form (CNFs). Basic blocks for building
CNFs are propositional variables that take the values false or true. The set of propositional variables is
denoted by Var. A literal is either a variable x or its negation $\neg x$. A clause is a disjunction of literals,
and a CNF is a conjunction of clauses. In the following, for convenience, we consider clauses as sets of
literals, and a CNF as a set of clauses. By $\mathrm{Cls}(\varphi)$ we denote the set of clauses contained in a CNF $\varphi$
and by $\mathrm{Var}(\varphi)$ we denote the set of variables contained in the CNF $\varphi$.

2.1 Ordered Binary Decision Diagrams 

An Ordered Binary Decision Diagram (OBDD) is a rooted, directed, acyclic graph, which consists of
decision nodes and two terminal nodes 0 and 1. Each decision node is labeled by a propositional variable
from Var and has two child nodes called low child and high child. The edge from a node to a low
(high) child represents an assignment of the variable to 0 (1). Such a structure is called ordered because
different variables appear in the same order on all paths from the root. Therefore, OBDDs assume that
there is a total order $\prec$ on the set of variables Var.

An OBDD is said to be reduced if the following two rules have been applied to its graph: 1) merge
isomorphic subgraphs; 2) eliminate any node whose two children are isomorphic. In our paper we
consider only reduced OBDDs.

Given a propositional formula $\varphi$ and an order on variables $\prec$, we define the size of an OBDD
$B(\varphi, \prec)$ representing $\varphi$ with respect to $\prec$ as the number of its internal nodes and denote it by
$\mathrm{size}(B(\varphi, \prec))$.
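
As an illustration of the two reduction rules and of the size measure, the following minimal Python sketch builds a reduced OBDD by exhaustive Shannon expansion and counts its internal nodes. The function name `build_robdd`, the dictionary-based unique table and the example formula are assumptions made for this sketch; it is exponential in the number of variables and only meant for tiny examples.

```python
def build_robdd(f, order):
    """Build a reduced OBDD of f under the variable order `order`.

    f maps a dict {variable: bool} to a truth value. Node ids 0 and 1 are
    the terminals. Returns (root id, number of internal nodes).
    """
    unique = {}  # (variable, low child, high child) -> node id

    def mk(var, low, high):
        if low == high:  # rule 2: drop a node whose two children coincide
            return low
        # rule 1: isomorphic subgraphs share one node via the unique table
        return unique.setdefault((var, low, high), len(unique) + 2)

    def rec(i, assignment):
        if i == len(order):
            return int(bool(f(assignment)))
        v = order[i]
        low = rec(i + 1, {**assignment, v: False})
        high = rec(i + 1, {**assignment, v: True})
        return mk(v, low, high)

    root = rec(0, {})
    return root, len(unique)


# Hypothetical example formula: (a1 and b1) or (a2 and b2).
f = lambda a: (a["a1"] and a["b1"]) or (a["a2"] and a["b2"])
_, size_interleaved = build_robdd(f, ["a1", "b1", "a2", "b2"])  # 4 internal nodes
_, size_separated = build_robdd(f, ["a1", "a2", "b1", "b2"])    # 6 internal nodes
```

The two calls use the same formula with different orders, which shows that the size depends on the chosen order $\prec$.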

We give a definition of an OBDD refutation adapting the definition from [3].

Definition 2.1 (OBDD refutation) Given a total order on variables $\prec$, an OBDD refutation of an
unsatisfiable CNF $\varphi$ is a sequence of OBDDs $B_1(\varphi_1, \prec), \ldots, B_n(\varphi_n, \prec)$ such that
$B_n(\varphi_n, \prec)$ is an OBDD representing the constant false and for each $B_i(\varphi_i, \prec)$, $1 \le i \le n$,
exactly one of the following holds.

• (Axiom) $B_i(\varphi_i, \prec)$ represents one of the clauses $C \in \varphi$;

• (Join) there are OBDDs $B_{i'}(\varphi_{i'}, \prec)$ and $B_{i''}(\varphi_{i''}, \prec)$ such that $1 \le i' < i'' < i$ and
$\varphi_i = \varphi_{i'} \wedge \varphi_{i''}$.

We say that n is the length of the OBDD refutation. The size of the OBDD refutation is defined as
$\sum_{i=1}^{n} \mathrm{size}(B_i(\varphi_i, \prec))$.
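
The definition can be made concrete with a small Python sketch that refutes the pigeonhole formula using only the axiom and join rules. Everything here is an assumption for illustration: the class `OBDD`, the helpers `php_clauses` and `refute`, the row-major default order, and the particular (linear) order in which clauses are joined. It is one possible refutation, not the only one covered by the definition.

```python
from itertools import combinations


class OBDD:
    """Tiny reduced-OBDD manager; node ids 0 and 1 are the terminals."""

    def __init__(self, order):
        self.level = {v: i for i, v in enumerate(order)}
        self.table = {}  # (var, low, high) -> id
        self.node = {}   # id -> (var, low, high)

    def mk(self, var, low, high):
        if low == high:
            return low
        key = (var, low, high)
        if key not in self.table:
            self.table[key] = len(self.table) + 2
            self.node[self.table[key]] = key
        return self.table[key]

    def clause(self, literals):
        """Axiom: OBDD of a disjunction of literals (var, is_positive)."""
        result = 0
        for var, positive in sorted(literals, key=lambda l: self.level[l[0]], reverse=True):
            result = self.mk(var, result, 1) if positive else self.mk(var, 1, result)
        return result

    def join(self, u, v):
        """Join: conjunction of two OBDDs (the apply operation for AND)."""
        memo = {}

        def go(u, v):
            if u == 0 or v == 0:
                return 0
            if u == 1:
                return v
            if v == 1 or u == v:
                return u
            if (u, v) not in memo:
                (xu, u0, u1), (xv, v0, v1) = self.node[u], self.node[v]
                if self.level[xu] < self.level[xv]:
                    top, v0, v1 = xu, v, v
                elif self.level[xv] < self.level[xu]:
                    top, u0, u1 = xv, u, u
                else:
                    top = xu
                memo[(u, v)] = self.mk(top, go(u0, v0), go(u1, v1))
            return memo[(u, v)]

        return go(u, v)

    def size(self, u):
        """Number of internal nodes reachable from u."""
        seen, stack = set(), [u]
        while stack:
            w = stack.pop()
            if w > 1 and w not in seen:
                seen.add(w)
                stack.extend(self.node[w][1:])
        return len(seen)


def php_clauses(n):
    pos = [[((i, j), True) for j in range(1, n + 1)] for i in range(1, n + 2)]
    neg = [[((i, k), False), ((j, k), False)]
           for k in range(1, n + 1) for i, j in combinations(range(1, n + 2), 2)]
    return pos + neg


def refute(n, order=None):
    """Join all clauses left to right; return (final node, sizes of all OBDDs)."""
    variables = [(i, j) for i in range(1, n + 2) for j in range(1, n + 1)]
    bdd = OBDD(order if order is not None else variables)
    clauses = php_clauses(n)
    current = bdd.clause(clauses[0])
    sizes = [bdd.size(current)]
    for c in clauses[1:]:
        axiom = bdd.clause(c)
        current = bdd.join(current, axiom)
        sizes += [bdd.size(axiom), bdd.size(current)]
    return current, sizes


final, sizes = refute(2)  # final is the terminal 0, i.e. the constant false
```

Here `sum(sizes)` corresponds to the size of the refutation and `len(sizes)` to its length.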

When it is convenient, instead of $B(\varphi, \prec)$ we write $B(\varphi)$ or just B. If an OBDD B represents a CNF
$\varphi$ then by $\mathrm{Cls}(B)$ we mean $\mathrm{Cls}(\varphi)$ and by $\mathrm{Var}(B)$ we mean $\mathrm{Var}(\varphi)$.

The size of the minimal OBDD representing a propositional formula $\varphi$ for a given order on variables
$\prec$ is described by the following structure theorem [10, 6]. We use $\mathbb{B} = \{0, 1\}$ to denote the set of
Boolean constants.

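Theorem 2.2 Suppose for a given formula $\varphi$ the following holds:

• $|\mathrm{Var}(\varphi)| = n$;

• $\prec$ is a total order on the set of variables $\mathrm{Var}(\varphi)$;

• $x_1, \ldots, x_k$ are the smallest k elements with respect to $\prec$ for some $k < n$;

• $A \subseteq \{1, \ldots, k\}$;

• For all distinct $\vec{x}, \vec{x}' \in \mathbb{B}^k$ such that $x_i = x'_i = z_i$ for all $i \notin A$, for a fixed
$\vec{z} \in \mathbb{B}^k$, there exists a $\vec{y} \in \mathbb{B}^{n-k}$ such that $\varphi(\vec{x}, \vec{y}) \neq \varphi(\vec{x}', \vec{y})$.

Then the size of the OBDD $B(\varphi, \prec)$ is at least $2^{|A|}$.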
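
To make the hypothesis of the theorem tangible, the following Python sketch checks it by brute force for small formulas. The function name `fooling_condition`, the tuple encoding of assignments and the two example formulas are assumptions of this sketch; the check enumerates all assignments and is only feasible for a handful of variables.

```python
from itertools import product


def fooling_condition(phi, n, k, A, z):
    """Brute-force check of the last hypothesis of the structure theorem.

    phi takes a tuple of n bits, listed in the variable order.
    A is a set of 1-based indices in {1, ..., k}; z is a tuple of k bits.
    """
    # All prefixes that agree with z outside A.
    candidates = [x for x in product((0, 1), repeat=k)
                  if all(x[i - 1] == z[i - 1] for i in range(1, k + 1) if i not in A)]
    for x in candidates:
        for x2 in candidates:
            if x != x2 and not any(phi(x + y) != phi(x2 + y)
                                   for y in product((0, 1), repeat=n - k)):
                return False  # x and x2 cannot be told apart by any suffix y
    return True


# Hypothetical example: (x1, x2) equals (y1, y2), with order x1, x2, y1, y2.
equal = lambda b: b[0] == b[2] and b[1] == b[3]
holds = fooling_condition(equal, n=4, k=2, A={1, 2}, z=(0, 0))
# If the condition holds, the theorem gives size >= 2 ** len(A) = 4 for this order.
```

For the formula `lambda b: b[0] and b[2]` with the same parameters the check fails, since the value of the second variable never influences the result.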

The proof of the lower bound presented in Section 3 is based on Theorem 2.2. However, in order
to obtain a lower bound we still have to solve some combinatorial problems.

2.2 The pigeonhole formula 

The pigeonhole principle states that n holes can hold at most n objects with one object in a hole. It can 
be formulated as a set of clauses as follows. 

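$$\mathrm{PC}_n = \bigwedge_{i=1}^{n+1} \Bigl( \bigvee_{j=1}^{n} P_{ij} \Bigr), \qquad
\mathrm{NC}_n = \bigwedge_{\substack{1 \le i < j \le n+1 \\ 1 \le k \le n}} (\neg P_{ik} \vee \neg P_{jk})$$

$$\mathrm{PHP}_n = \mathrm{PC}_n \wedge \mathrm{NC}_n$$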
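
The clause sets can be generated mechanically. The Python sketch below builds $\mathrm{PHP}_n$ and checks by exhaustive search, for small n, that it is unsatisfiable and that dropping any single clause makes it satisfiable. The function names `php` and `satisfiable` and the triple encoding (pigeon, hole, sign) of literals are assumptions of this sketch.

```python
from itertools import combinations, product


def php(n):
    """Clauses of PHP_n; a literal is (pigeon i, hole j, sign)."""
    pc = [[(i, j, True) for j in range(1, n + 1)] for i in range(1, n + 2)]
    nc = [[(i, k, False), (j, k, False)]
          for i, j in combinations(range(1, n + 2), 2) for k in range(1, n + 1)]
    return pc + nc


def satisfiable(clauses, n):
    """Exhaustive satisfiability test over all (n + 1) * n variables."""
    variables = [(i, j) for i in range(1, n + 2) for j in range(1, n + 1)]
    for bits in product((False, True), repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[(i, j)] == sign for i, j, sign in c) for c in clauses):
            return True
    return False


clauses = php(2)
unsat = not satisfiable(clauses, 2)
minimal = all(satisfiable(clauses[:t] + clauses[t + 1:], 2) for t in range(len(clauses)))
```

For n = 2 the formula has 3 positive and 6 negative clauses over 6 variables, so the search visits at most 64 assignments.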
Now we introduce notations that will be used in the rest of the paper. Let

$$\mathrm{PC}_n^* = \bigwedge_{i=1}^{n} \Bigl( \bigvee_{j=1}^{n} P_{ij} \Bigr).$$

Hence, $\mathrm{PC}_n^*$ contains the first n clauses of $\mathrm{PC}_n$. We represent $\mathrm{PC}_n^*$ as a matrix of variables with n rows
and n columns (the clause $\bigvee_{j=1}^{n} P_{ij}$ corresponds to the i-th row). We denote this matrix by P. For each
row in P there is a corresponding clause in $\mathrm{PC}_n^*$ and vice versa, therefore we will refer to a row as a
clause, and to a set of rows as a set of clauses.

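For a given total order on variables $\prec$, we define $S_\prec$ as the set containing the $\lfloor n^2/2 \rfloor$ smallest
elements of $\mathrm{Var}(\mathrm{PC}_n^*)$ with respect to the ordering $\prec$, and let $S_\succ = \mathrm{Var}(\mathrm{PC}_n^*) \setminus S_\prec$.
Moreover, we define

$$S_\prec^* = \{P_{ij} \in \mathrm{Var}(\mathrm{PHP}_n) \mid P_{ij} \preceq \max S_\prec\},$$

and

$$S_\succ^* = \mathrm{Var}(\mathrm{PHP}_n) \setminus S_\prec^*.$$

Note that $S_\prec \cup S_\succ = \mathrm{Var}(\mathrm{PC}_n^*)$ and $S_\prec^* \cup S_\succ^* = \mathrm{Var}(\mathrm{PHP}_n)$. The sets $S_\prec$ and $S_\succ$ are
defined in such a way that the difference between the sizes of these sets is at most one; in contrast,
this does not hold for the sets $S_\prec^*$ and $S_\succ^*$.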

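For each OBDD $B_i$ in an OBDD refutation of $\mathrm{PHP}_n$ we define

$$S_\prec^i = S_\prec^* \cap \mathrm{Var}(B_i) \quad\text{and}\quad S_\succ^i = \mathrm{Var}(B_i) \setminus S_\prec^i.$$

Moreover, we define

$$\mathrm{Cls}^{NC}(B_i) = \mathrm{Cls}(B_i) \cap \mathrm{Cls}(\mathrm{NC}_n) \quad\text{and}\quad \mathrm{Cls}^{PC}(B_i) = \mathrm{Cls}(B_i) \cap \mathrm{Cls}(\mathrm{PC}_n).$$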

3 The main result 

The proof of our lower bound is inspired by the proof of a lower bound for a particular OBDD refutation
given in [6].

Lemma 3.1 Consider a matrix $M = (m_{ij})$, $1 \le i \le n$, $1 \le j \le n$. Let the matrix entries be colored
equally white and black, i.e. the difference between the number of white entries and the number of black
entries is at most one. Let $m = \lfloor cn \rfloor$ for $c = \frac{1}{2} - \frac{1}{4}\sqrt{2} \approx 0.146$. Then at least one of the
following holds.

• One can choose m rows, and in each of these rows a white and a black entry, such that all these
2m entries are in different columns.

• One can choose m columns, and in each of these columns a white and a black entry, such that all
these 2m entries are in different rows.

Proof Starting with the given matrix, repeat the following process as long as possible.

Choose a row in the matrix containing both a white and a black entry. Remove both the
column containing the white entry and the column containing the black entry. Also remove
the chosen row.

Assume this repetition stops after k steps. If $k \ge m$ the first property of the lemma holds and we are done.
In the remaining case the remaining matrix consists of $n - k$ rows with $n - 2k$ entries in each row, where
every row either only consists of white entries or only of black entries. Assume that at least $n - 2m$ of
these rows are totally black. Using $k < m$ we conclude that the number of black entries in this remaining
matrix is at least

$$(n - 2m)(n - 2k) > (n - 2m)^2 \ge \frac{n^2}{2},$$

contradicting the assumption that at most half of the entries are black (possibly up to one). So at least
$n - k - (n - 2m) = 2m - k > m$ of these rows are totally white. By symmetry also at least m of these rows
are totally black. As the length of these rows is $n - 2k > n - 2m > m$, the second property of the lemma
is easily fulfilled.
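
The removal process used in the proof can be sketched in Python as follows. The function name `greedy_rows`, the 'W'/'B' encoding of colors and the rule of always taking the first suitable row and columns are assumptions of this sketch; the proof allows any choice.

```python
def greedy_rows(color):
    """Run the removal process on a square matrix of 'W'/'B' entries.

    Returns (chosen, rows, cols): chosen is a list of triples
    (row, white column, black column); rows and cols are what remains.
    """
    n = len(color)
    rows, cols = set(range(n)), set(range(n))
    chosen = []
    progress = True
    while progress:
        progress = False
        for r in sorted(rows):
            whites = [j for j in sorted(cols) if color[r][j] == 'W']
            blacks = [j for j in sorted(cols) if color[r][j] == 'B']
            if whites and blacks:
                # Remove the row and the two columns used for it.
                chosen.append((r, whites[0], blacks[0]))
                rows.remove(r)
                cols -= {whites[0], blacks[0]}
                progress = True
                break
    return chosen, rows, cols


# Hypothetical example: an 8 x 8 checkerboard, which is colored equally.
n = 8
board = [['W' if (i + j) % 2 == 0 else 'B' for j in range(n)] for i in range(n)]
chosen, rows, cols = greedy_rows(board)
```

When the process stops, every remaining row is monochromatic on the remaining columns, which is the situation analysed in the second half of the proof.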
By fine-tuning the argument the constant c in Lemma 3.1 can be improved. We conjecture that it also
holds for $c = 1 - \frac{1}{2}\sqrt{2} \approx 0.293$. Choosing the $n \times n$ matrix in which the left upper $k \times k$ square
is black for $k \approx n/\sqrt{2}$ and the rest is white, one observes that this value will be sharp. As our main
result involves an exponential lower bound, we do not focus on the precise optimal value of c.

The pigeonhole formula is an unsatisfiable CNF and, hence, the OBDD representing $\mathrm{PHP}_n$ is just
the terminal node 0. Therefore, we have to show that for an arbitrary order on variables and an arbitrary
way to combine clauses there is an intermediate OBDD of size exponential in n. We start our proof with
some simple observations describing properties of intermediate OBDDs. The following lemma
generalizes a well-known fact about binary trees claiming the existence of subtrees with a weight lying
between a and 2a (for any definition of "weight" as a sum of the weights of its leaves).

Lemma 3.2 Let C be a finite set, $R \subseteq C$ with $|R| > 2$, and $B_1, \ldots, B_l \subseteq C$ a sequence with:

1. $B_l = C$

2. For each $B_i$ ($1 \le i \le l$), either $B_i = \emptyset$, $B_i = \{c\}$ for $c \in C$, or $B_i = B_j \cup B_k$ for some j, k with
$j < k < i$.

Then, for each a with $\frac{1}{|R|} < a \le \frac{1}{2}$, there is a $j \le l$ such that

$$a|R| \le |B_j \cap R| < 2a|R|.$$

Proof We give a proof by contradiction. Suppose, for each $B_j$, either

$$|B_j \cap R| < a|R| \quad\text{or}\quad |B_j \cap R| \ge 2a|R|.$$

As $B_l \cap R = C \cap R = R$, the inequality $|B_l \cap R| \ge 2a|R|$ holds for the final element $B_l$ of the sequence.
On the other hand, for singletons $B_j = \{c\}$, we have $|B_j \cap R| = 0 < a|R|$ for $c \notin R$, and
$|B_j \cap R| = 1 < a|R|$ for $c \in R$, as $a > 1/|R|$. Moreover, for $B_j = \emptyset$, $|B_j \cap R| < a|R|$ obviously holds.
Following now the predecessors of $B_l$ (via the construction by set union) in the sequence backwards, we
finally arrive at an index k for which the following holds:

• $|B_k \cap R| \ge 2a|R|$, and

• $B_k = B_{k'} \cup B_{k''}$, where $|B_{k'} \cap R| < a|R|$ and $|B_{k''} \cap R| < a|R|$.

As $B_k \cap R = (B_{k'} \cup B_{k''}) \cap R = (B_{k'} \cap R) \cup (B_{k''} \cap R)$, and thus
$|B_k \cap R| \le |B_{k'} \cap R| + |B_{k''} \cap R| < 2a|R|$, we arrive at a contradiction to $|B_k \cap R| \ge 2a|R|$.
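
The statement of the lemma can be tried out on a concrete union sequence. In the Python sketch below, the helper names `find_balanced_index` and `chain_of_unions` and the particular sequence (all singletons followed by cumulative unions) are assumptions chosen for illustration; any sequence satisfying the two preconditions could be used instead.

```python
from fractions import Fraction


def find_balanced_index(seq, R, a):
    """First index j with a*|R| <= |seq[j] & R| < 2*a*|R|, or None."""
    for j, B in enumerate(seq):
        if a * len(R) <= len(B & R) < 2 * a * len(R):
            return j
    return None


def chain_of_unions(C):
    """Singletons of C followed by cumulative unions; the last set is C."""
    seq = [frozenset([c]) for c in sorted(C)]
    acc = seq[0]
    for s in seq[1:]:      # seq[1:] is a copy, so appending below is safe
        acc = acc | s      # union of two sets that occur earlier in seq
        seq.append(acc)
    return seq


C = frozenset(range(1, 9))
seq = chain_of_unions(C)
j = find_balanced_index(seq, R=C, a=Fraction(1, 4))
# With |R| = 8 and a = 1/4 the lemma promises a set with 2 or 3 elements of R.
```

Using `Fraction` keeps the comparisons with $a|R|$ and $2a|R|$ exact.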

Lemma 3.3 Suppose $B_1, \ldots, B_l$ is a BDD refutation of $\mathrm{PHP}_n$ and $R \subseteq \mathrm{Cls}(\mathrm{PC}_n)$ with $|R| > 4$. Then
there is an $i \le l$ such that

$$|R|/4 \le |\mathrm{Cls}(B_i) \cap R| < 2|R|/4.$$

Proof Follows from Lemma 3.2.

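Let $B_1, \ldots, B_l$ be a BDD refutation of $\mathrm{PHP}_n$. For each $i \le l$ define $J_i$ as the set of columns from P as
follows:

$$J_i = \{ j \in \{1, \ldots, n\} \mid \exists a, b : \neg P_{aj} \vee \neg P_{bj} \in \mathrm{Cls}(B_i),\ P_{aj} \in S_\prec, \text{ and } P_{bj} \in S_\succ \}.$$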

Lemma 3.4 Suppose $B_1, \ldots, B_l$ is a BDD refutation of $\mathrm{PHP}_n$ for a total order on variables $\prec$, and
$P' \subseteq \{1, \ldots, n\}$ with $|P'| > 4$. Then there is an $i \le l$ such that

$$|P'|/4 \le |J_i \cap P'| < |P'|/2.$$

Proof Follows from Lemma 3.2 using $C = \{1, \ldots, n\}$, $R = P'$, $a = 1/4$, and $J_1, \ldots, J_l$ for the sequence
$(B_i)_{1 \le i \le l}$, for which the precondition of Lemma 3.2 holds, as is easily checked.

Theorem 3.5 For every order $\prec$ on the set of variables, the size of each OBDD refutation of $\mathrm{PHP}_n$ is
$\Omega(1.025^n)$.

Proof Let $n > 34$, and $B_1, \ldots, B_l$ be an OBDD refutation of $\mathrm{PHP}_n$. We prove that for an arbitrary total
order on variables $\prec$ there is an $i \le l$ such that $\mathrm{size}(B_i) \ge 2^{n(\frac{1}{2} - \frac{1}{4}\sqrt{2})/4}$. Since
$2^{(\frac{1}{2} - \frac{1}{4}\sqrt{2})/4} \ge 1.025$ we have $\mathrm{size}(B_i) \ge 1.025^n$ and the theorem holds.
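
The two numeric facts used here, the base of the exponential and the threshold on n, can be checked with a few lines of Python. The variable names are chosen for this sketch only, and the computation uses floating-point arithmetic.

```python
import math

c = 0.5 - math.sqrt(2) / 4   # constant of Lemma 3.1, roughly 0.146
base = 2 ** (c / 4)          # growth rate: size(B_i) >= base ** n

# Smallest n for which floor(c * n) >= 5, so that a set of floor(c * n)
# rows has more than 4 elements.
smallest_n = next(n for n in range(1, 100) if math.floor(c * n) >= 5)
```

The value of `base` lies slightly above 1.025, and `smallest_n` is 35, which matches the assumption $n > 34$.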

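We apply Lemma 3.1 to the matrix representing $\mathrm{PC}_n^*$, where the entries in $S_\prec$ are colored white and
the entries in $S_\succ$ are colored black. Then one of the following holds.

• There is a set of $\lfloor n(\frac{1}{2} - \frac{1}{4}\sqrt{2}) \rfloor$ rows (we denote this set by R) and there is a set of
$2\lfloor n(\frac{1}{2} - \frac{1}{4}\sqrt{2}) \rfloor$ entries (we denote this set by $S^R$) such that the following holds:

- For each $r \in R$ there are $P_{ra}, P_{rb} \in S^R$ such that $P_{ra} \in S_\prec$ and $P_{rb} \in S_\succ$.

- For distinct $P_{ab}, P_{cd} \in S^R$, $b \neq d$.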
We define

$$R' = \mathrm{Cls}(B_i) \cap R.$$

As $n > 34$, $|R| = \lfloor n(\frac{1}{2} - \frac{1}{4}\sqrt{2}) \rfloor \ge 5$, and we can apply Lemma 3.3. Thus we know that there is an
$i \le l$ such that

$$|R|/4 \le |R'| < 2|R|/4.$$

We get

$$2|R'| + 1 \le |R|.$$

For each row $r \in R'$ we fix an entry of $S^R$ that is in the set $S_\prec$. We collect these elements in the set A.
For each row $r \in R'$ we also fix an entry of $S^R$ that is in $S_\succ$ and collect these elements in the set Y. Let

$$R^c = \{ j \mid \exists i : P_{ij} \in A \cup Y \}.$$

Taking into account that $2|R'| + 1 \le |R|$ we compute

$$|\mathrm{Cls}^{PC}(B_i)| \le (n + 1) - (|R| - |R'|) \le (n + 1) - ((2|R'| + 1) - |R'|) = n - |R'|.$$

We denote $F = \mathrm{Cls}^{PC}(B_i) \setminus R'$. By definition $R' \subseteq \mathrm{Cls}^{PC}(B_i)$. Hence, we obtain

$$|F| = |\mathrm{Cls}^{PC}(B_i)| - |R'| \le n - 2|R'|.$$

Let J be the set of columns not contained in $R^c$, so that $|J| = n - |R^c|$. Since we have chosen the set of
rows $R'$ as satisfying the conditions of Lemma 3.1 we get $|R^c| = 2|R'|$ and

$$|J| = n - 2|R'|$$

and

$$|F| \le |J|.$$

For each $C \in F$ we fix one variable and collect these variables in the set X such that the following holds.
For distinct $P_{ab}, P_{cd} \in X$, $b \neq d$. This is possible because $|F| \le |J|$.
We define $X_\prec = S_\prec^i \cap X$ and $X_\succ = S_\succ^i \cap X$.
We apply Theorem 2.2 on

$$k = |S_\prec^i|.$$

For $j = 1, \ldots, k$ we define $z_j = 1$ if $z_j \in A$ or $z_j \in X_\prec$, otherwise we define $z_j = 0$.

Choose $\vec{x}, \vec{x}'$ satisfying $\vec{x} \neq \vec{x}'$ and $x_j = x'_j = z_j$ for all $z_j \notin A$. Then there is $j'$ such that
$x_{j'} \neq x'_{j'}$.

Let $\vec{y} = (y_{k+1}, \ldots, y_q)$, where $q = |\mathrm{Var}(B_i)|$, be the vector defined by $y_j = 1$ if $y_j \in X_\succ$ and $y_j = 0$
for all $y_j \in S_\succ^i \setminus (Y \cup X_\succ)$. If $y_j \in Y$ then we choose $y_j = 0$ if it is in the same row as $x_{j'}$ and $y_j = 1$
otherwise.

Hence, the subset of clauses represented by $B_i$ evaluates to $x_{j'}$ for the assignment $(\vec{x}, \vec{y})$ and to
$x'_{j'}$ for the assignment $(\vec{x}', \vec{y})$.

The size of the set A is at least $n(\frac{1}{2} - \frac{1}{4}\sqrt{2})/4$ by construction. Hence, by Theorem 2.2 we conclude
that $\mathrm{size}(B_i) \ge 2^{|A|} \ge 2^{|R|/4} \ge 2^{n(\frac{1}{2} - \frac{1}{4}\sqrt{2})/4}$ for sufficiently large n.

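• There is a set of $\lfloor n(\frac{1}{2} - \frac{1}{4}\sqrt{2}) \rfloor$ columns (we denote this set by Q) and there is a set containing
$2\lfloor n(\frac{1}{2} - \frac{1}{4}\sqrt{2}) \rfloor$ entries (we denote this set by $S^Q$) such that the following holds:

- For each $q \in Q$ there are $P_{aq}, P_{bq} \in S^Q$ such that $P_{aq} \in S_\prec$ and $P_{bq} \in S_\succ$.

- For distinct $P_{ab}, P_{cd} \in S^Q$, $a \neq c$.

Suppose $m = \lfloor n(\frac{1}{2} - \frac{1}{4}\sqrt{2}) \rfloor$.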

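Let

$$Q^i = \{ j \in Q \mid \exists a, b : \neg P_{aj} \vee \neg P_{bj} \in \mathrm{Cls}(B_i) \ \&\ P_{aj} \in S_\prec \ \&\ P_{bj} \in S_\succ \}.$$

Then, by Lemma 3.4, there is $B_i$ for $i \le l$ such that

$$m/4 \le |Q^i| < m/2.$$

For each $j \in Q^i$ we choose $\neg P_{aj} \vee \neg P_{bj}$ such that $\neg P_{aj} \vee \neg P_{bj} \in \mathrm{Cls}(B_i)$, where $P_{aj} \in S_\prec$ and
$P_{bj} \in S_\succ$. We collect $P_{aj}$ in A and $P_{bj}$ in Y.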

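Let

$$Q^r = \{ a \mid \exists j : P_{aj} \in A \cup Y \}.$$

Let $\bar{Q}^i = Q \setminus Q^i$. Then

$$|\bar{Q}^i| > m/2.$$

For each $j \in \bar{Q}^i$ we fix $P_{a_j j}, P_{b_j j} \in S^Q$, where $P_{a_j j} \in S_\prec^*$ and $P_{b_j j} \in S_\succ^*$. We collect $P_{a_j j}$ in
$X_\prec$ and we collect $P_{b_j j}$ in $X_\succ$ for all $j \in \bar{Q}^i$.
We define

$$W = \{ a \mid \exists b : P_{ab} \in X_\prec \cup X_\succ \}.$$

By Lemma 3.1 all entries collected for the columns in $\bar{Q}^i$ are from different rows. Hence, we obtain

$$|W| = 2|\bar{Q}^i|.$$

Taking into account that $|\bar{Q}^i| > m/2$ we get

$$|W| > 2m/2 = m$$

and since $|W|$ is a natural number we get

$$|W| \ge m + 1.$$

We denote

$$Q^* = \mathrm{Cls}^{PC}(B_i) \setminus W.$$

The set of clauses $\mathrm{Cls}^{PC}(B_i)$ can contain an arbitrary subset of clauses from $\mathrm{PC}_n$, i.e.

$$1 \le |\mathrm{Cls}^{PC}(B_i)| \le n + 1.$$

We take into account that $|W| \ge m + 1$ and compute

$$|Q^*| \le (n + 1) - |W| \le (n + 1) - (m + 1) = n - m.$$

We define $J = \{ j \mid \exists a : P_{aj} \in \mathrm{Var}(\mathrm{PHP}_n) \ \&\ j \notin Q \}$. Then

$$|J| = n - |Q| = n - m.$$

Therefore, $|Q^*| \le |J|$.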

For each row $r \in Q^*$ we fix one entry and collect these entries in the set W. We require that the
entries collected in W satisfy the following properties.

- r contains at least one entry such that this entry is in one of the columns of J;

- each column in J contains at most one fixed entry.

Since $|Q^*| \le |J|$ there is such a set W. We denote $X_\prec^i = S_\prec^i \cap X_\prec$, $X_\succ^i = S_\succ^i \cap X_\succ$,
$W_\prec = S_\prec^i \cap W$ and $W_\succ = S_\succ^i \cap W$. We apply Theorem 2.2 on

$$k = |S_\prec^i|.$$

For $j = 1, \ldots, k$ we define $z_j = 1$ if $z_j \in A \cup X_\prec^i \cup W_\prec$, and we define $z_j = 0$ in all other cases. We
choose $\vec{x}, \vec{x}'$ satisfying $\vec{x} \neq \vec{x}'$ and $x_j = x'_j = z_j$ for all $z_j \notin A$. Then there is $j' \in \{1, \ldots, k\}$ such
that $x_{j'} \neq x'_{j'}$. Let

$$\vec{y} = (y_{k+1}, \ldots, y_q),$$

where $q = |\mathrm{Var}(B_i)|$, be the vector defined by $y_j = 1$ for all $y_j \in X_\succ^i$, $y_j \in W_\succ$. For $y_j \in Y$ we
define $y_j = 1$ if it is in the same column as $x_{j'}$ and $y_j = 0$ otherwise. We choose $y_j = 0$ in all
other cases. Therefore, for each row there is an entry that is assigned to 1 and for each column
except $j'$ and columns from the set $\bar{Q}^i$ there is at most one entry assigned to 1. If a column t is
contained in the set $\bar{Q}^i$ then two entries in this column can be assigned to 1. By construction, for
each column t in the set $\bar{Q}^i$ there is a clause $\neg P_{s't} \vee \neg P_{s''t} \notin \mathrm{Cls}(B_i)$. Therefore, assigning $P_{s't}$ and
$P_{s''t}$ simultaneously to 1 does not violate the satisfiability of the subformula represented by $B_i$.

Hence, the subset of clauses represented by $B_i$ evaluates to $x_{j'}$ for the assignment $(\vec{x}, \vec{y})$ and to
$x'_{j'}$ for the assignment $(\vec{x}', \vec{y})$.

The size of the set A is at least $n(\frac{1}{2} - \frac{1}{4}\sqrt{2})/4$ by construction. Hence, by Theorem 2.2 we conclude
that $\mathrm{size}(B_i) \ge 2^{|A|} \ge 2^{|Q|/4} \ge 2^{n(\frac{1}{2} - \frac{1}{4}\sqrt{2})/4}$ for sufficiently large n.
4 Conclusions

This paper improves an earlier result in which the use of the OBDD proof system is restricted in such a
way that the proof must follow the structure of a given formula. We have shown that the OBDD proof system
containing two rules, axiom and join, has lower bounds exponential in n on refutations for the pigeonhole
formulas. On the other hand, it has been shown in [3] that OBDD refutations of the same formulas can
be given of polynomial size if the projection rule is added to the above two rules. Therefore, the result
presented in this paper implies that the projection rule is responsible for the gap between polynomial and
exponential, just like the extension rule in extended resolution is responsible for a similar gap.